A PARADIGM TO ASSESS AND EVALUATE TOOLS TO SUPPORT THE
SOFTWARE  DEVELOPMENT  PROCESS 


INTRODUCTION 

This  study  was  part  of  a  larger  project  called  the  “ProtoTech  HiPer-D  (High  Performance  Distributed 
Computing Program¹) Joint Prototyping Demonstration Project.” The Prototyping Technology
(ProtoTech)  program  is  a  research  effort  supported  by  the  Advanced  Research  Projects  Agency  (ARPA) 
to  develop  advanced  prototyping  languages.  The  HiPer-D  Demonstration  was  a  joint  effort  of  the 
ProtoTech  community  and  the  Office  of  Naval  Research  to  explore  the  applicability  of  prototyping 
technologies  to  realistic  military  problems.  An  important  first  step  in  the  HiPer-D  project  was  to  identify 
an  appropriate  problem  for  study,  one  of  suitable  complexity  to  demonstrate  the  technology  realistically, 
yet  small  enough  to  be  addressed  by  a  limited  number  of  people  in  a  limited  period  of  time.  The 
geometric  region  (GEO)  server  problem  met  these  criteria;  it  was  also  of  particular  interest  to  the  Navy 
for its relevance to the planned AEGIS weapons system upgrade. The capability to assign trackable
objects to their correct geometric regions in space is an essential function in the AEGIS system and
is currently spread across multiple algorithms. Consolidation into a single GEO server has the potential for
significant  system  improvement.  The  HiPer-D  Demonstration  comprised  several  independent  teams  of 
prototyping  experts,  each  addressing  the  GEO  server  problem  in  a  different  prototyping  environment.  In 
addition,  several  related  efforts  were  undertaken  to  examine  specific  aspects  of  the  problem. 

Our  work  was  conducted  to  assess  a  paradigm  for  evaluating  how  well  a  prototyping  tool  supports 
software  development.  Prototyping  tools  are  used  to  produce  “an  executable  unit  that  demonstrates 
particular  aspects  of  the  behavior  or  functionality  of  the  desired  software  product”  (Lee  et  al.  1994).  This 
assessment  included  data  collected  at  a  two-day  interactive  software  development  exercise  at  the  Naval 
Surface  Warfare  Center,  Dahlgren  Division  (NSWCDD)  involving  a  prototyping  tool  expert  from  Kestrel 
Institute  and  two  domain  specialists  from  NSWCDD  using  the  Kestrel  Interactive  Development  System 
(KIDS).  KIDS  was  selected  as  the  software  tool  for  this  exercise  for  several  reasons.  It  is  a  product  that  is 
relatively  mature  and  has  been  used  in  a  variety  of  domains.  It  also  is  designed  to  support  a  particular 
software development process. This enabled us to make predictions about the types of behavior we would
expect and thereby not only assess the tool but also perform a meta-assessment of the evaluation methodology.
Finally,  the  tool  supports  a  software  development  process  that  is  abstract  and  closely  coupled  to  the  theory 
of  the  domain.  Paradigms  to  evaluate  software  development  at  an  abstract  level  are  not  available.  The 
focus  at  this  level  is  the  representation  of  the  domain  in  software,  and  actions  taken  to  develop  and  change 
this  domain.  This  focus  is  consistent  with  the  general  hypothesis  that  was  explored.  This  exercise 
provided  us  an  opportunity  to  further  our  research  in  the  exploration  of  design  space. 


¹ This is a project to develop a new architecture for the next generation of the AEGIS Weapons System. It is to be jointly
developed by three organizations: Johns Hopkins University/Applied Physics Laboratory (JHU/APL), General Electric (GE), and
the Naval Surface Warfare Center at Dahlgren, Virginia (NSWCDD) supported by the Computer Sciences Corporation (CSC).
The work was divided into three thrusts: 1) to evaluate a wide range of architectures and combat system issues and choose a
recommended architecture; 2) to evaluate promising technologies provided by the Advanced Research Projects Agency (ARPA);
and 3) to evaluate and extend emerging HiPer-D technologies, methods, and tools. The technologies provided by ARPA under
Thrust 2 include the ISIS distributed computing workbench from Cornell University, Intel’s Paragon supercomputer, and the
Mach microkernel developed by Carnegie Mellon University.


Manuscript  approved  Oct.  20, 1994. 
EXPLORATION OF THE SOFTWARE DESIGN SPACE

The  general  hypothesis  being  tested  is  whether  a  tool  supports  exploration  of  the  design  space.  The 
evaluation  of  this  hypothesis  implies  an  understanding  of  what  design  space  exploration  means.  If  one 
considers  design  space  exploration  as  one  part  of  the  software  engineering  process,  then  initial  definitions 
of  this  concept  can  be  derived  from  a  framework  of  design  methodologies  developed  by  Song  and 
Osterweil  (1992).  Their  framework  is  a  method-component  hierarchy  and  includes  four  abstract  types: 
Concept,  Artifact,  Representation,  and  Action.  Using  their  notions  of  these  types,  the  following 
preliminary  definitions  were  developed. 

Definitions 

Design  space  exploration  consists  of  actions  that  produce  different  representations. 

Representations  are  descriptions  or  specifications  of  design  artifacts. 

Actions  are  physical  and/or  mental  processing  steps  used  to  produce  or  modify  an  artifact. 

These  preliminary  definitions  need  to  be  tailored  somewhat  for  the  particular  tool  that  is  evaluated. 
Accordingly,  examples  of  representations  and  actions  that  are  possible  within  the  KIDS  tool  were 
generated  to  guide  the  development  of  the  data  collection  methodology.  The  representations  created  in 
KIDS  are  the  domain  theory,  specifications,  and  programs  described  in  the  Refine  language  (upon  which 
KIDS  is  built),  and  the  Lisp  code.  Examples  of  actions  within  the  KIDS  tool  that  produce  different 
representations  are: 

Develop  theory:  produce  the  initial  representation 

Return  to  initial/previous/next  state:  reset  representation  to  another  state 

Focus  (i.e.,  select  subset  of  specification):  select  a  subset  of  the  representation  for  subsequent  actions 

Apply  transformation 

Select  inference  mode 

Fold,  unfold  a  specification 

Simplify  an  expression 

Select  algorithm  tactics 

Select  compiler 

A  record  of  the  actions  taken  while  solving  the  GEO  server  problem  was  made  by  observation  and 
audio  recording  of  the  participants’  statements,  and  by  periodic  storage  of  the  solution  file.  These  data 
were  examined  to  assess  whether  the  tool  supports  design  space  exploration.  In  order  to  develop  some 
detail  about  the  expected  effects  of  the  tool  on  exploration,  specific  hypotheses  were  generated  before  the 
data  collection  session.  These  specific  hypotheses  provided  further  guidance  about  the  type  of  data  that 
should  be  collected. 

Specific  Hypotheses 

The following section presents specific hypotheses (Ho1 through Ho5), followed by data that were
collected and analyzed to test the hypotheses. At this point in the development of an evaluation paradigm,
the hypotheses cannot be tested statistically.

Ho1: Alternative representations can be generated and actions taken on them.

Distinct,  different  alternatives  will  be  represented  and  actions  taken  on  them.  The  data  to  evaluate 
this  hypothesis  will  consist  of  identifying  alternative  representations  that  are  generated  during  the 
challenge problem design session. The issue here is how to distinguish between alternatives. In many
cases, this distinction could come from the alternative menu options that are offered to the user, such as
selecting  a  particular  algorithm  tactic  or  selecting  a  particular  compiler.  This  choice  will  produce  a 
different  specification.  Another  difference  in  the  representation  might  be  the  feedback  to  the  user  that 
occurs  when  the  inference  mechanism  is  operating.  The  user  has  several  options  that  will  provide 
different  types  of  feedback  and  different  opportunities  for  intervention. 

Ho2:  Design  exploration  will  involve  evaluation  and  comparison  of  alternatives. 

Evaluation  actions  will  be  taken  on  alternatives  followed  by  modification  or  generation  of  an 
alternative.  This  hypothesis  further  specifies  what  exploration  of  alternatives  means  by  focusing  on 
particular actions, namely those that produce an evaluation of the artifact. Data for this hypothesis could come
from  the  final  stage  of  development,  when  the  specification  is  compiled  and  executed.  At  that  point,  we 
may  observe  the  KIDS  users  examining  software  metrics  such  as  size  and  speed.  However,  during  the 
development  process,  there  may  be  verbal  comments  alluding  to  comparisons  of  alternative  designs.  One 
of  the  windowing  configuration  options  available  in  KIDS  places  two  program  viewing  windows  side  by 
side.  Use  of  this  configuration  would  be  evidence  for  this  specific  hypothesis. 

Ho3:  Alternative  representations  may  support  evaluation  and  comparison  of  design  alternatives. 

This  hypothesis  refers  to  the  possibility  that  the  type  of  representation  may  or  may  not  support 
comparison  of  alternatives.  For  example,  an  option  in  KIDS  is  to  fold/unfold  the  specification,  producing 
less  and  greater  detail  respectively,  and  allowing  the  user  to  compare  alternatives  at  different  levels  of 
detail. Evidence for this hypothesis will be actions taken to change the form of the representation without
adding to or subtracting from the complete specification.

Ho4:  Design  space  exploration  will  consist  of  a  fan-out  generation  and  subsequent  culling  of  artifacts. 

A  history  trace  of  alternatives  generated  will  show  an  increase  in  the  number  of  alternatives,  followed 
by  a  narrowing  of  the  alternatives  toward  a  particular  solution.  This  hypothesis  is  suggested  since  it  is  an 
intuitive  model  of  the  creative  process. 

Ho5:  Design  exploration  will  involve  an  examination  of  prior  actions  and  representations. 

Evidence  of  the  user’s  examination  and  analysis  of  prior  actions  and  alternatives  will  occur  in  two 
ways  using  the  KIDS  tool.  The  first  will  be  through  conversations  in  which  the  users  will  reflect  upon  and 
comment  upon  prior  actions.  The  second  will  be  through  viewing  the  Output  History  Pane  and  taking 
actions  on  prior  derivations  listed  within  this  window. 

Hypothesis 5 is a simple statement of a process that is complex and little understood. It refers to a
meta level of software development in which a person is evaluating what has been done. This examination
can be observed in some ways, but it is done to support an understanding of what has
been done and the planning of further actions and the development of representations.

Elaboration  of  this  hypothesis  can  take  on  different  forms  depending  on  whether  design  space 
exploration  is  conceived  of  as  problem  solving,  creative  construction,  invention,  etc.  This  point  returns  us 
to  the  particular  perspective  of  this  research:  to  develop  and  evaluate  a  paradigm  to  monitor  design  space 
exploration.  Once  we  have  the  capability  to  follow  a  process  of  exploring  the  design  space,  we  will  be 
able  to  be  more  specific  about  what  it  actually  is. 

USABILITY  EVALUATION 

In  addition  to  developing  a  technique  to  evaluate  how  the  tool  would  support  design  space 
exploration,  there  was  interest  in  evaluating  the  usability  of  the  tool.  The  evaluation  of  this  aspect  of 
software development used traditional usability procedures. These procedures were partially tailored for
the  KIDS  tool.  For  example,  one  of  the  evaluation  criteria  is  user  performance  on  a  benchmark  task.  With 
KIDS,  the  primary  user  tasks  are  those  in  the  software  process  model  on  which  KIDS  is  based.  These 
tasks,  and  the  subtasks  within  each  of  them,  are  as  follows: 

Specify domain theory:

    Specify functional constraints on input/output behavior
    Generate abstract expression
    Create rules
    Derive laws

Convert specification to code:

    Design algorithm
    Simplify
    Partially evaluate
    Refine data types

The performance of these tasks was assessed by real-time observation and inquiry, and by a post-session
questionnaire. Real-time observation was performed using the annotation sheet shown in Fig. 1.
Four  columns  were  used  to  record  the  time  of  the  observation,  the  current  user  task,  the  program  function 
or  window  in  which  the  user’s  actions  were  located,  and  the  details  of  the  observation.  Observations  were 
detailed  so  that  the  following  types  of  information  would  be  obtained  for  later  analysis: 

•  Task completion time and errors

•  Number of options/alternatives explored

•  Critical incidents (both positive and negative)

    Inability to find/open/close/save “work”
    Recognition of success
    Recognition of other programming options
    Evaluation of algorithm performance vis-a-vis different implementations
    Recognition of and recovery from error
    Error-free performance on first attempt


Time        Task        Location        Annotation

Fig.  1  —  Annotation  sheet  for  real-time  observations 


In  addition  to  observations,  occasional  inquiries  were  made  to  clarify  comments  and  activities.  In 
particular,  clarification  was  sought  on  issues  related  to  the  form  of  the  representation  that  was  being 
developed  (e.g.,  why  did  you  decide  to  choose  this  form  of  representing  the  domain?).  In  order  to  have  a 
record  to  clarify  the  observations  and  inquiries,  a  continuous  audio  recording  was  made  of  the  sessions. 

Finally,  a  post-session  questionnaire  was  developed  that  included  open-ended  questions  about  the 
strengths and weaknesses of the tool and solicited recommendations for improvement. A series of these
questions was formatted for the particular design model in KIDS. These questions asked how the tool
supported  the  specific  phases  in  the  design  process  and  what  improvements  would  be  recommended  to 
support  these  phases.  This  tailoring  is  consistent  with  the  IEEE  standard  to  evaluate  Computer-Aided 
Software  Engineering  (CASE)  tools  as  described  below.  The  questionnaire  also  included  a  table  of  design 
heuristics  that  have  been  used  in  an  approach  called  heuristic  evaluation  to  find  usability  problems  in 
interfaces  (Nielsen  1992).  Heuristic  evaluation  uses  a  small  set  of  individuals  to  evaluate  an  interface 
according  to  design  principles. 

COMPARISON  TO  IEEE  STD  1209-1992 

IEEE  STD  1209-1992  (IEEE  1992)  outlines  a  recommended  practice  to  evaluate  and  select  CASE 
tools.  Although  the  tools  of  interest  here  might  not  be  considered  CASE  tools  in  some  respects,  it  is  useful 
to  compare  our  approach  with  this  standard  of  recommended  practice.  The  standard  outlines  several 
evaluation  process  models.  The  one  most  comparable  to  our  approach  is  the  “evaluation  for  future 
reference”  model.  In  this  model,  the  tool  is  being  evaluated  on  all  relevant  criteria  and  the  results  are 
made  available  for  future  reference.  In  the  other  models,  a  tool  selection  phase  is  involved  that  introduces 
issues  that  are  not  present  when  a  single  tool  is  being  evaluated  (e.g.,  weighting  of  selection  criteria).  Two 
steps  are  part  of  this  model:  the  development  of  tailored  criteria  and  the  evaluation  itself.  Criteria  tailoring 
is  the  selection  and  definition  of  a  set  of  criteria  whereby  the  characteristics  of  the  tool  are  quantified  and 
measured.  The  types  of  criteria  listed  in  the  standard  include  reliability,  usability,  efficiency, 
functionality,  maintainability,  profitability,  and  a  general  category.  Many  of  these  criteria  are  similar  to 
those  used  in  the  ProtoTech  challenge  problem  paradigm.  Two  of  them,  usability  and  functionality,  were 
the  key  criteria  presented  in  the  initial  briefing  of  this  proposed  paradigm  to  the  ProtoTech  community. 
Details  on  how  to  quantify  these  two  criteria  were  developed  prior  to  the  data  collection  session.  These 
details  reflected  some  tailoring  for  the  particular  tool.  Specifically,  examples  of  the  types  of  actions  that 
could  be  taken  in  the  KIDS  tool  were  listed  as  performance  indicators  to  test  the  hypotheses.  Thus,  the 
tailoring  recommended  in  the  IEEE  standard  was  partly  completed.  As  described  earlier,  usability  was 
assessed  through  observation  and  inquiry  and  through  completion  of  a  post-session  questionnaire. 

In  the  present  effort,  there  is  particular  interest  in  investigating  how  to  evaluate  exploration  of  the 
software  design  space.  Although  the  IEEE  standard  does  not  explicitly  address  this  aspect  of 
functionality,  it  does  include  some  specifics  that  would  seem  to  be  related  to  design  space  exploration.  It 
decomposes  specific  functionality  criteria  into  three  classes:  those  related  to  the  operating  environment, 
those  related  to  particular  life-cycle  phases,  and  those  common  across  all  life-cycle  phases.  The  life-cycle 
phases  are  decomposed  into  modeling,  implementation,  and  testing,  and  it  is  the  criteria  listed  under 
modeling  that  may  be  related  to  design  exploration.  Modeling  criteria  are  intended  to  assess  a  tool’s 
capability  to  support  the  identification  of  software  requirements,  to  transform  requirements  into  design, 
and  “to  express  software  design”  (italics  added  for  emphasis).  Specific  modeling  criteria  include: 
diagramming,  graphic  analysis,  requirement  specification  entry  and  editing,  requirement  specification 
language,  design  specification  entry  and  editing,  design  specification  language,  data  modeling,  process 
modeling,  simulation,  prototyping,  screen  generation,  traceability,  specification  consistency  and 
completeness  checking,  other  analyses,  and  report  writing.  Thus  the  standard  supports  the  waterfall 
software  development  model  and  places  emphasis  on  the  capability  to  transform  requirements  into  design 
specifications.  However,  it  is  useful  to  note  the  importance  the  standard  places  on  the  capability  of  a  tool 
to support the expression of the software design. Clearly, design space exploration requires, and will be
enhanced by, adequate design expression tools. The standard also incorporates into these specific criteria
mechanisms to evaluate the design, such as simulation, data and process modeling, and traceability. It
might  be  a  useful  exercise  to  tie  some  of  these  specific  criteria  into  a  prototypical  process  model  for 
design  space  exploration  showing  the  feedback  loops  and  iteration  that  would  be  expected  and 
theoretically  should  be  supported  to  enhance  the  exploration  of  the  design  space. 
RESULTS

Prior  Hypotheses 

The  data  for  these  results  come  from  four  sources:  the  real-time  notes  (a  transcribed  and  edited 
version  of  these  notes  is  attached  as  Appendix  A),  the  audio  tapes  (used  to  edit  and  refine  the  real-time 
observations),  the  solution  files  that  were  saved  periodically  (the  last  solution  file  is  included  in  Appendix 
B),  and  the  post-session  questionnaire  that  was  completed  by  two  persons  from  NSWCDD  (Appendix  C). 
Because  an  objective  of  this  research  is  to  refine  a  methodology  to  assess  the  functionality  and  usability  of 
prototyping  tools,  the  results  present  two  perspectives:  a  description  of  the  type  of  assessment  available  in 
the  employed  form  of  the  techniques,  and  an  assessment  of  the  techniques  themselves.  The  first 
perspective  on  the  results  is  presented  within  the  context  of  the  detailed  hypotheses  that  were  made. 

Ho1: Alternative representations can be generated and actions taken on them.

An  example  of  alternative  representations  is  the  initial  and  alternative  versions  of  the  function 
HIPER-D  which  places  contacts  (objects)  into  the  particular  zones  (see  Fig.  2).  The  initial  version  simply 
specifies  the  mapping  function.  The  alternative  version  divides  this  mapping  function  into  three  cases:  the 
empty  set  of  zones,  the  singleton  set  of  zones,  and  the  remaining  cases.  The  alternative  version  was 
produced  by  applying  the  divide  and  conquer  tactic  to  the  initial  function,  followed  by  the  application  of 
simplification  tactics.  The  divide  and  conquer  tactic  generated  an  alternative  software  design,  upon  which 
further  actions  were  taken.  Therefore  at  least  one  data  point  supports  this  hypothesis. 


Initial version

function HIPER-D
  (fo : FLATLAND | ... )
  returns (contact-map : map(CONTACT, set(ZONE))
    | contact-map = {| c -> {z | (z:ZONE) z in all-zones(fo) & in-zone(c,z)}
                       | (c:CONTACT) c in contacts(fo) |})


Alternative version

function HIPER-D-2 (CNTKS-7: set(CONTACT), ZNS-9: set(ZONE))
  returns
    (Z-192: map(CONTACT, set(ZONE))
     | Z-192
       = {| C -> {Z | (Z: ZONE)
                      Z in ZNS-9 & IN-ZONE(C, Z)}
          | (C: CONTACT) C in CNTKS-7 |})
  = if CNTKS-7 = {} then {||}
    elseif CNTKS-7 less arb(CNTKS-7) = {}
    then {| C -> {Z | (Z: ZONE)
                      Z in ZNS-9 & IN-ZONE(C, Z)}
          | (C: CONTACT) C in CNTKS-7 |}
    else let (Y-OP-3
              : tuple
                (set(CONTACT), set(ZONE), set(CONTACT), set(ZONE))
              = HIPER-D-2-DECOMPOSE-USING-UNION-DESTRUCTOR
                (CNTKS-7, ZNS-9))
         HIPER-D-2(Y-OP-3.1, Y-OP-3.2)
         +* HIPER-D-2(Y-OP-3.3, Y-OP-3.4)


Fig.  2  —  Initial  and  final  versions  of  function  HIPER-D 
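
To make the two versions in Fig. 2 easier to compare, the following is a minimal Python sketch of the same mapping, not the Refine code produced in the session. The names hiper_d, hiper_d_2, and in_zone are illustrative, the zone-membership test is passed in as a parameter, and the halving step is only an assumed stand-in for the decomposition operator shown in the figure.

```python
def hiper_d(contacts, zones, in_zone):
    """Initial version: map each contact to the set of zones that contain it."""
    return {c: {z for z in zones if in_zone(c, z)} for c in contacts}


def hiper_d_2(contacts, zones, in_zone):
    """Divide-and-conquer version: empty, singleton, and general cases."""
    items = list(contacts)
    if not items:
        # Empty set of contacts: the empty map.
        return {}
    if len(items) == 1:
        # Singleton set: evaluate the specification directly.
        c = items[0]
        return {c: {z for z in zones if in_zone(c, z)}}
    # General case: split the contacts (assumed halving rule), solve each
    # part recursively, and combine the two partial maps.
    mid = len(items) // 2
    result = hiper_d_2(items[:mid], zones, in_zone)
    result.update(hiper_d_2(items[mid:], zones, in_zone))
    return result
```

Both functions return the same map for the same inputs; the second simply exposes the case structure that the divide and conquer tactic introduced.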
Ho2: Design exploration will involve evaluation and comparison of alternatives.

Three  types  of  data  were  suggested  for  evaluating  this  hypothesis.  The  first  was  that  the  alternatives 
would  be  coded  and  executed  and  comparisons  made  using  standard  software  metrics.  This  outcome  was 
not  observed  during  the  session  since  a  fully  executable  solution  was  not  completed  in  the  time  set  aside. 
However,  given  further  time  it  is  expected  that  this  type  of  data  would  be  observed. 

The  second  type  of  data  was  verbal  comments  made  during  the  development  process.  There  were 
several  examples  of  this.  Early  discussion  focused  on  defining  the  objects  in  the  domain,  especially  the 
types  of  contacts  and  the  types  of  zones.  The  initial  list  of  zones  came  from  the  problem  description  and 
included  four  types:  WEAPON,  ENGAGEABILITY,  SLAVE-DOCTRINE,  and  TIGHT.  These  were 
abstracted  to  two  types,  WEDGED-ANNULUS  and  POLYGON,  which  were  each  to  be  defined  in  a 
manner  that  would  cover  virtually  all  cases.  The  properties  of  the  WEDGED-ANNULUS  were  defined  so 
that  WEAPON,  ENGAGEABILITY,  and  SLAVE-DOCTRINE  zones  could  be  handled  by  this  object. 
The  properties  of  POLYGON  were  defined  as  the  position  of  a  sequential  list  of  vertices  that  would 
support  the  inclusion  of  both  convex  and  concave  polygons  and  any  number  of  vertices.  The  decision  to 
opt  for  generality  was  made  in  the  specification  phase  of  design  and  was  made  with  some  verbal 
consideration  of  the  implications.  Later,  when  it  came  to  specifying  whether  a  contact  location  was  within 
a  polygon  in  the  function  IN-POLYGON,  a  particular  algorithm  had  to  be  implemented  and  the  coding 
implications  of  this  completely  generalized  alternative  became  apparent.  Much  of  the  afternoon  of  the 
second  day  was  spent  generating  code  for  an  IN-POLYGON  function.  This  coding  would  not  have  been 
necessary  if  a  library  function  were  readily  available.  In  fact,  there  is  a  C  function  that  checks  whether  a 
point  is  within  a  polygon.  The  ability  to  incorporate  this  function  as  a  call  would  have  enhanced  the 
development  of  the  generalized  solution  sought  in  this  session  and  perhaps  supported  the  further 
evaluation  and  comparison  of  generalized  designs  to  designs  that  were  more  specialized.  In  this  instance, 
design  space  exploration  would  have  been  enhanced  if  this  particular  function  could  have  been  found 
through  a  library  search  and  once  found,  quickly  incorporated  into  the  solution. 
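
The report does not say which point-in-polygon algorithm was written in the session or which library routine was meant. As an illustration only, the sketch below uses the standard ray-casting (even-odd) test, which handles convex and concave polygons given as a sequential list of vertices. The function name in_polygon and the tuple-based point format are assumptions.

```python
def in_polygon(point, vertices):
    """Even-odd ray-casting test for a point against a simple polygon.

    point    -- (x, y) tuple
    vertices -- sequential list of (x, y) tuples, convex or concave
    Points exactly on an edge are not treated specially.
    """
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Each crossing to the right of the point flips the inside/outside state.
            if x < x_cross:
                inside = not inside
    return inside
```

For an L-shaped (concave) polygon such as [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)], the point (1, 1) is reported inside and the point (3, 3), which lies in the notch, is reported outside.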

The  third  type  of  data  for  evaluating  this  hypothesis  was  a  side-by-side  windowing  configuration. 
There  were  at  least  two  instances  of  this.  The  first  was  about  17:57  on  the  first  day  (see  Appendix  A).  The 
tutor  opened  two  editing  buffers  to  enter  constraints  on  recently  entered  types.  These  two  buffers  were 
used  to  maintain  consistency  between  an  initial  specification  and  an  elaboration  of  this  specification.  The 
second instance was 16:16 on the second day when two buffers were again used. Here the
buffers were used to implement and test the code, and additional functions were created that
had to be consistent with representations created earlier. In both cases, the representations being
compared  are  not  alternatives  in  the  strict  sense  but  are  different  representations  of  the  domain,  which 
have  to  include  some  elements  that  are  consistent.  These  instances  were  captured  by  an  observer  query 
and  observation. 

Ho3:  Alternative  representations  may  support  evaluation  and  comparison  of  design  alternatives. 

The  possibility  in  this  hypothesis  is  that  evaluation  and  comparison  of  alternative  designs  may  be 
better  supported  by  some  forms  of  representation.  The  a  priori  example  was  the  folding  or  unfolding  of  a 
specification.  This  was  not  observed  in  the  session.  However,  following  up  on  the  discussion  of  the 
previous  hypothesis,  the  implications  of  the  generalized  representation  of  polygon  became  more  evident 
as  the  particular  algorithm  to  implement  IN-POLYGON  was  written.  In  other  words,  the  difficulty  of 
implementing the generalized representation may not have been as apparent when the zone class
representation  was  decided. 
Ho4: Design space exploration will consist of a fan-out generation and subsequent culling of artifacts.

There  was  one  instance  that  supported  this  hypothesis.  When  the  function  HIPER-D  was  expanded 
with  the  Divide  and  Conquer  tactic,  redundant  code  was  generated  for  the  empty  set  case.  The  subsequent 
simplification for this case reduced the artifact. It was expected that this hypothesis would also be
revealed  in  the  generation  of  different  design  representations,  not  just  detailed  specifications.  It  was  also 
expected  that  artifacts  at  a  higher  level  would  be  generated,  then  pruned  as  the  alternatives  were  evaluated 
and  compared.  Instances  of  this  were  not  observed  in  the  artifacts,  although  examples  of  this  process  are 
suggested  by  the  audio  record.  In  particular,  during  the  process  to  define  whether  a  point  was  within  a 
polygon,  several  alternative  algorithms  were  generated  and  discussed.  One  was  finally  settled  upon. 

Ho5:  Design  exploration  will  involve  an  examination  of  prior  actions  and  representations. 

There  were  several  instances  that  are  consistent  with  this  hypothesis.  Two  that  may  be  particularly 
relevant  were  instances  when  a  higher  level  object  description  was  modified  after  and  during  the  process 
of  specifying  details  at  a  lower  level.  For  example,  in  the  process  of  defining  the  function  IN-WEDGED- 
ANNULUS,  the  insight  occurred  that  the  whole  zone  did  not  have  to  be  passed,  just  the  zone  shape.  This 
required  a  rewriting  of  the  function  IN-ZONE,  which  had  already  been  generated. 
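
The interface change described above can be sketched as follows. This is a hypothetical Python illustration, not the session's specification: the report does not give the properties of WEDGED-ANNULUS, so the fields used here (center, inner and outer radius, start and end angle) and the class and function names are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WedgedAnnulus:
    # Assumed properties; the report does not list the actual ones.
    cx: float
    cy: float
    r_inner: float
    r_outer: float
    start_deg: float  # wedge start, counterclockwise from the +x axis
    end_deg: float    # wedge end; assumed 0 < end_deg - start_deg <= 360


@dataclass(frozen=True)
class Zone:
    name: str
    kind: str  # for example "WEAPON" or "ENGAGEABILITY"
    shape: WedgedAnnulus


def in_wedged_annulus(point, shape):
    """The test needs only the shape, not the whole zone."""
    dx, dy = point[0] - shape.cx, point[1] - shape.cy
    r = math.hypot(dx, dy)
    if not (shape.r_inner <= r <= shape.r_outer):
        return False
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    span = shape.end_deg - shape.start_deg
    return (angle - shape.start_deg) % 360.0 <= span


def in_zone(point, zone):
    """Rewritten caller: extract the shape and pass only that."""
    return in_wedged_annulus(point, zone.shape)
```

Passing only the shape keeps the geometric test independent of zone attributes such as name or type, which is the kind of revision to an earlier representation that Ho5 anticipates.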

Supplementary  Analyses 

Observational  Data 

The  combination  of  real-time  observation  with  audio  recording  produced  a  substantial  data  record 
that  might  provide  the  basis  for  extensive  analyses.  Some  examples  of  possible  analyses  follow. 

Observations were coded and categorized as follows and can be used for further analyses:

K:  Key  incidents 
O:  Observations 
Q:  Observer  queries 
R:  Responses  to  queries 
T:  Tutor  comments  or  queries 
U:  User  comments  or  queries 

A distribution of these codes by type is shown in Table 1. Fifty-eight of the 246 codes were incidents of
key interest. Half of the key incidents (29) were errors (see Table 2). Very few were representation
decisions, a key behavior for the assessment of the a priori hypotheses. The low number means that either
very few decisions were being made, or that the decisions were not captured with this observation
procedure.  It  is  apparent  that  further  refinement  of  a  method  to  trace  the  development  and  modification  of 
the  domain  representation  would  be  useful. 

A  second  analysis  of  the  observation  data  focused  on  task  timelines  and  task  context.  The  intent  was 
to  determine  the  contextual  shifts  that  occur  in  the  software  development  process.  Studies  of  user 
interaction  with  intelligent  systems  have  pointed  to  the  disruptive  effects  of  contextual  shifts  (Malin  et  al. 
1991).  An  example  is  when  an  operator  of  a  fault  diagnosis  system  must  mentally  shift  from  thinking 
about  the  domain  to  thinking  about  how  to  use  a  computer  interface  to  control  the  system  or  obtain 
information  about  system  status.  This  disruptive  effect  of  the  interface  is  called  “wading  through  the 
interface.”  Similar  interference  may  occur  in  software  development  with  advanced  prototyping  tools.  The 
development process might be interrupted by a computer interface and development environment that
impose contextual switching. As a simple example, if in the process of coding an algorithm, users have
to search for the name of a library, they may forget some of the subsequent logic that had been mentally
sketched, especially if the operations imposed by the interface require mental work and activity that is
irrelevant to specifics of the domain.


Table  1  —  Frequencies  of  Different  Types  of  Observations 


Type of Observation Code     Frequency

Key Incident                     58
Observation                      92
Observer Query                   10
Response to Query                20
Tutor Comment/Query              38
User Comment/Query               28


Table  2  —  Classification  of  Key  Incidents  Observed 


Type of Key Incident         Frequency

Error                            29
Rewrite/Revision                  5
Representation Decision           9
Misc.                            15
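
The proportions discussed above can be checked directly from the two tables. The following is a minimal Python sketch. The counts are transcribed from Tables 1 and 2, while the variable names and the helper function are illustrative assumptions rather than part of the study's analysis procedure.

```python
from collections import Counter

# Frequencies transcribed from Table 1 (keyed by observation code letter).
observation_codes = Counter({"K": 58, "O": 92, "Q": 10, "R": 20, "T": 38, "U": 28})

# Frequencies transcribed from Table 2.
key_incidents = Counter({
    "Error": 29,
    "Rewrite/Revision": 5,
    "Representation Decision": 9,
    "Misc.": 15,
})


def shares(counts):
    """Return each category's share of the total, as a percentage to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_codes = sum(observation_codes.values())   # all coded observations
total_key = sum(key_incidents.values())         # should equal the K count

print(total_codes, total_key)
print(shares(observation_codes))
print(shares(key_incidents))
```

The Table 2 frequencies sum to the number of key incidents in Table 1, so the two tables are mutually consistent.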

To  assess  contextual  switching,  the  task  and  task  times  were  imported  into  commercial  project 
management  software  to  generate  graphical  timelines.  Additionally,  the  task  context  was  coded  into  one 
of  the  following  categories: 


A)  Demonstrate  tool 

B)  Understand  problem 

C)  Define  theoretical  objects 

D)  Define  attributes  of  objects 

E)  Define  functions 

F)  Refine  and  compile  specification 

G)  Perform  save/load/find  operations 


Figure  3  shows  the  timeline.  The  tasks  are  sorted  first  by  task  context  and  second  by  starting  time. 
Note  that  activity  on  the  first  day  follows  a  waterfall  pattern.  There  is  some  contextual  switching  between 
categories  C  and  D.  It’s  unlikely  that  this  switching  would  be  confusing  except  perhaps  on  the  switches 
from  D  back  to  C.  Note  that  this  pattern  was  iterated.  Activity  on  the  second  day  shifts  between  contexts 
to a greater degree, and the switching is less systematic. This shifting is due to several factors. First, on the
second day, the solution was nearing completion, which meant that the files were being saved more
frequently and later phases of the development process (i.e., compile) were possible. Second, in
addressing a difficult aspect of the challenge, the users were searching and exploring options and moving
out of a systematic development pattern. It would be particularly interesting to discover the development
processes that occur when people address the more difficult aspects of the problem. Finally, as the
solution evolved, refinement and compilation were possible, which in turn led to examination and
revision of earlier products.
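
One way to quantify the contextual switching described above is to count changes of task context between consecutive tasks ordered by starting time. The following Python sketch assumes each task has already been assigned one of the category letters A through G. The two sequences are hypothetical and are not the data plotted in the timeline.

```python
def count_switches(contexts):
    """Count adjacent pairs of tasks whose context codes differ."""
    return sum(1 for prev, cur in zip(contexts, contexts[1:]) if prev != cur)


def backward_switches(contexts):
    """Count switches that return to an earlier category (e.g., D back to C)."""
    return sum(1 for prev, cur in zip(contexts, contexts[1:]) if cur < prev)


# Hypothetical context sequences, ordered by starting time; NOT the study's data.
day1 = list("AABBCCDCDDEE")   # mostly waterfall, with one return from D to C
day2 = list("EGFEGCFGEF")     # frequent, less systematic shifts

for label, seq in (("day 1", day1), ("day 2", day2)):
    print(label, count_switches(seq), backward_switches(seq))
```

A waterfall pattern yields few backward switches relative to the total. A high proportion of backward switches would correspond to the less systematic shifting attributed to the second day.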

Fig. 3 - Task timeline sorted by task context and starting times

Questionnaire

Appendix  C  presents  the  questionnaire.  Because  only  two  users  were  involved  in  this  study,  the  data 
collected  from  the  questionnaire  are  sparse.  Furthermore,  the  users  had  only  a  viewgraph  presentation  and 
brief  demonstration  of  the  tool  during  the  morning  of  the  first  day — much  less  than  the  standard  5-day 
training  normally  given  to  understand  the  tool.  However,  even  with  only  a  sample  of  two  novice  users,  the 
results  of  the  questionnaire  do  provide  information  about  suggested  changes  to  the  interface  that  may  be 
useful  for  the  developer. 

Methodology  Evaluation 

The  data  collected  can  provide  instances  that  support  the  hypotheses,  but  are  inappropriate  to  use  for 
statistical  tests  of  any  of  the  hypotheses.  For  further  development  of  the  evaluation  methodology, 
alternative  hypotheses  will  have  to  be  generated  a  priori  and  the  data  collection  technique  refined. 
Specifically,  prior  clarification  or  definition  of  instances  that  support  competing  alternatives  will  have  to 
be  made  to  avoid  confirmatory  biasing  effects. 

Observation  Data  Capture  and  Analyses 

As  illustrated  in  the  supplemental  analyses,  the  transcribed  observational  data  can  support  some 
interesting  “discovery”  analyses.  These  analyses  will  require  an  accurate  and  consistent  method  of 
recording  the  data.  While  much  of  the  data  can  be  obtained  from  an  audio  record,  and  more  from  a  video 
record  (which  was  not  done  in  this  instance),  a  real-time  recording  methodology  is  important  in  order  to 
capture  the  key  incidents.  The  real-time  record  can  also  focus  any  subsequent  transcription  of  the  tapes. 

Additional  analyses  could  be  performed  with  an  improved  data  capture  procedure.  For  example, 
analysis of errors could derive the causes of the error. But this would require detailed data on the error
action  and  its  context,  information  that  could  come  from  a  video  record  and  behavioral  actions. 

Two  caveats  should  be  noted  about  the  observational  data  collected.  First,  the  amount  of  data 
collected  was  due  in  part  to  the  particular  composition  of  the  design  team:  a  tool  expert  and  two  domain 
users.  The  tool  expert  was  only  generally  familiar  with  the  domain;  the  users  had  little  knowledge  of  the 
tool. This composition facilitated the generation of a verbal record, probably more so than if a single
person were involved and asked to think aloud or record their process. Second, a major limitation of this
composition is that the tool expert had two roles to assume: a tutor for the tool and a software designer.
This  duality  would  be  mixed  in  the  data  record,  especially  the  verbal  record  and  perhaps  in  the  artifact 
record.  Such  a  mixture  would  confound  two  phases  in  the  evaluation  of  a  tool:  its  learnability  and  its 
effectiveness  in  developing  a  design  solution. 

Questionnaire 

The questionnaire did not include a rating scale, but User1 added one, after consulting with the
experimenter. Usually this would be highly inappropriate, but in this case, the users were instructed to
recommend changes to the questionnaire instrument itself. User1 employed the response scale in
answering each of the questions that solicited an evaluation of a particular function of the tool. User1 also
provided verbal labels for the units of the response scale and in this sense, the numeric scale was simply a
shorthand notation for the verbal judgments. Rather than presenting an extended discussion about the
validity and merit of a numeric response scale, a few comments will be made. First, although the user felt
comfortable with using a response scale, the meaning of the scale units (e.g., excellent, good, above
average,...) is uncertain in the absence of a standard or a set of definitions. Furthermore, in the absence of
a response scale, the respondent may be inclined to provide greater detail. For example, in response to
question 9 (“How well does the tool support exploration of the design space?”), User2's response that the
tool provides “an opportunity but doesn't force it or explicitly encourage it” gives some information about
how this user views the design capability in the tool.

It  is  apparent  in  both  users’  responses  that  definitions  of  some  concepts  must  be  added.  This 
refinement  of  a  questionnaire  comes  about  through  the  normal  process  of  pilot  testing.  One  feature  of  the 
questionnaire  that  seemed  to  solicit  information  in  a  useful  manner  was  to  direct  a  set  of  three  questions 
toward  each  of  the  particular  functions  of  the  KIDS  tool.  The  first  question  in  the  set  asks  about  the 
general  value  of  the  tool  for  the  function  (e.g.,  Q5  “How  well  does  the  tool  support  algorithm  design?”) 
and  the  second  and  third  questions  solicit  details  about  the  best  features  for  this  function  and 
recommended  changes  or  additions.  Sometimes  the  answers  will  be  contradictory.  For  example,  in 
response to question 5, User1 indicated that one of the best features was the algorithm design library
while  User2  recommended  some  library-like  features. 

Finally,  it  is  also  apparent  that  clarification  of  many  of  the  users'  responses  would  be  helpful.  This 
suggests  that  a  structured  interview  would  be  more  appropriate  than  a  self-administered  questionnaire. 
Because  the  user  community  for  these  advanced  prototyping  tools  will  be  relatively  small,  a  structured 
interview  is  feasible  from  a  sampling  perspective.  And  since  many  of  the  questions  will  address  aspects 
specific  to  a  particular  tool,  prior  refinement  of  a  questionnaire  will  be  impractical.  Thus  it  is 
recommended  that  structured  post-session  interviews  be  used  rather  than  self-administered  questionnaires. 
U: What is significance of stuff to right of “=”?

RCD: Initialization for the attribute, initial values when compiled.

Appendix B

LAST  SOLUTION  FILE 


%%%  Mode:  RE;  Package:  RE;  Base:  10.;  Syntax:  Refine 
!!  in-package("RE") 

!! in-grammar('THEORY-GRAMMAR, 'REGROUP)

THEORY  HIPER-D-4 

% - 

THEORY-IMPORTS  {} 

% - 

THEORY-TYPE-PARAMETERS  { } 

% - 

THEORY-TYPES 

type CLASSIFIER = symbol
type AIR-SEA-MODE = symbol

var HIPER-D-object : OBJECT-CLASS subtype-of USER-OBJECT

var FLATLAND : OBJECT-CLASS subtype-of HIPER-D-object
var contacts : map(flatland, set(CONTACT)) = {||}
var flatland-zones : map(flatland, set(ZONE)) = {||}

var CONTACT : OBJECT-CLASS subtype-of HIPER-D-object
var coordinates : map(CONTACT, POSITION) = {||}
var classification : map(CONTACT, CLASSIFIER) = {||}
var mode : map(CONTACT, AIR-SEA-MODE) = {||}
var contact-zones : map(CONTACT, set(ZONE)) = {||}

var ZONE : OBJECT-CLASS subtype-of HIPER-D-object
var WEAPONS-DOCTRINE : OBJECT-CLASS subtype-of ZONE
var SLAVE-DOCTRINE : OBJECT-CLASS subtype-of ZONE
var ENGAGEABILITY-ZONE : OBJECT-CLASS subtype-of ZONE
var TIGHT-ZONE : OBJECT-CLASS subtype-of ZONE

var WD-shape : map(WEAPONS-DOCTRINE, WEDGED-ANNULUS) = {||}
var SD-shape : map(SLAVE-DOCTRINE, WEDGED-ANNULUS) = {||}
var EZ-shape : map(ENGAGEABILITY-ZONE, WEDGED-ANNULUS) = {||}
var TZ-shape : map(TIGHT-ZONE, POLYGON) = {||}

var WEDGED-ANNULUS : OBJECT-CLASS subtype-of HIPER-D-object
var inner-radius : map(WEDGED-ANNULUS, integer) = {||}
var outer-radius : map(WEDGED-ANNULUS, integer) = {||}
var initial-angle : map(WEDGED-ANNULUS, integer) = {||}
var final-angle : map(WEDGED-ANNULUS, integer) = {||}
var origin : map(WEDGED-ANNULUS, POSITION) = {||}

var POLYGON : OBJECT-CLASS subtype-of HIPER-D-object
var vertices : map(POLYGON, seq(POSITION)) = {||}

var POSITION : OBJECT-CLASS subtype-of HIPER-D-object
var x-coord : map(POSITION, integer) = {||}
var y-coord : map(POSITION, integer) = {||}

type angle = integer


% - 

THEORY-OPERATIONS 

function HIPER-D
  (fo : FLATLAND)
  returns (contact-map : map(CONTACT, set(ZONE))
           | contact-map = {| c -> {z | (z : ZONE) z in all-zones(fo) & in-zone(c, z)}
                              | (c : CONTACT) c in contacts(fo) |})

function HIPER-D-2 (CNTKS-7 : set(CONTACT), ZNS-9 : set(ZONE))
  returns
    (Z-192 : map(CONTACT, set(ZONE))
     | Z-192
       = {| C -> {Z | (Z : ZONE)
                      Z in ZNS-9 & IN-ZONE(C, Z)}
          | (C : CONTACT) C in CNTKS-7 |})
  = if CNTKS-7 = {} then {||}
    elseif CNTKS-7 less arb(CNTKS-7) = {}
    then {| C -> {Z | (Z : ZONE)
                      Z in ZNS-9 & IN-ZONE(C, Z)}
          | (C : CONTACT) C in CNTKS-7 |}
    else let (Y-OP-3
              : tuple
                (set(CONTACT), set(ZONE), set(CONTACT), set(ZONE))
              = HIPER-D-2-DECOMPOSE-USING-UNION-DESTRUCTOR
                (CNTKS-7, ZNS-9))
         HIPER-D-2(Y-OP-3.1, Y-OP-3.2)
         +* HIPER-D-2(Y-OP-3.3, Y-OP-3.4)

function ALL-ZONES
  (fo : FLATLAND) : set(ZONE)
  = flatland-zones(fo) union reduce(union, image(contact-zones, contacts(fo)))

function IN-ZONE
  (c : CONTACT, z : ZONE) : boolean
  = (if WEAPONS-DOCTRINE(z) then in-wedged-annulus(c, WD-shape(z))
     elseif SLAVE-DOCTRINE(z) then in-wedged-annulus(c, SD-shape(z))
     elseif ENGAGEABILITY-ZONE(z) then in-wedged-annulus(c, EZ-shape(z))
     elseif TIGHT-ZONE(z) then in-polygon(c, TZ-shape(z))
     else undefined)

function IN-WEDGED-ANNULUS
  (c : CONTACT, z : WEDGED-ANNULUS) : boolean
  = (let (p : POSITION = coordinates(c))
     <x-coord(p), y-coord(p)>
     in
     { <x,y> | (x : integer, y : integer, d : real, contact-angle : angle, offset : integer)
               d = distance(x-coord(origin(z)), y-coord(origin(z)), x, y)
               & inner-radius(z) <= d & d <= outer-radius(z)
               & (d > 0.0
                  & initial-angle(z) ~= final-angle(z)
                  => contact-angle = compute-angle(x-coord(origin(z)), y-coord(origin(z)), x, y, d)
                     & offset = (contact-angle - initial-angle(z)) mod 360
                     & offset <= (final-angle(z) - initial-angle(z)) mod 360)
     })

function DISTANCE
  (x1 : integer, y1 : integer, x2 : integer, y2 : integer) : real
  = sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1))

% note: we have rounded, so this is approximate

function COMPUTE-ANGLE
  (x1 : integer, y1 : integer, x2 : integer, y2 : integer, d : real
   | d = distance(x1, y1, x2, y2) & d > 0.0) : integer
  = round(acos((x2 - x1) / d))

function FIND-SEGMENT-INTERSECTION
  (x1 : integer, y1 : integer, x2 : integer, y2 : integer,
   u1 : integer, v1 : integer, u2 : integer, v2 : integer)
  : tuple(real, real, real, real)
  = (let (slope-xy : real = (x2 - x1)/(y2 - y1),
          slope-uv : real = (u2 - u1)/(v2 - v1))
     let (intercept-xy : real = y1 - slope-xy * x1,
          intercept-uv : real = v1 - slope-uv * u1)
     if slope-xy = slope-uv
     then (if intercept-xy ~= intercept-uv
           then undefined
           else check-coincident-lines(x1, y1, x2, y2, u1, v1, u2, v2))
     else (let (x : real = (intercept-uv - intercept-xy)/(slope-uv - slope-xy))
           let (y : real = slope-xy * x + intercept-xy)
           (if x1 <= x2
            then (if x1 <= x & x <= x2
                  then <x,y,x,y>
                  else undefined)
            else (if x2 <= x & x <= x1
                  then <x,y,x,y>
                  else undefined))))

function CHECK-COINCIDENT-LINES
  (x1 : integer, y1 : integer, x2 : integer, y2 : integer,
   u1 : integer, v1 : integer, u2 : integer, v2 : integer)
  : tuple(real, real, real, real)
  = undefined

function COUNT-INTERSECTIONS
  (cx : integer, cy : integer, outx : integer, outy : integer,
   z-vertices : seq(POSITION), cnt : integer) : integer
  = (if size(z-vertices) <= 1
     then cnt
     else (let (Intersect-interval : tuple(real, real, real, real)
                = find-segment-intersection(cx, cy, outx, outy,
                                            x-coord(first(z-vertices)),
                                            y-coord(first(z-vertices)),
                                            x-coord(second(z-vertices)),
                                            y-coord(second(z-vertices))))
           if defined?(Intersect-interval)
           then (if Intersect-interval.1 = Intersect-interval.3
                    & Intersect-interval.2 = Intersect-interval.4
                 then % single point intersection
                   (if Intersect-interval.1 = integer-to-real(x-coord(first(z-vertices)))
                       & Intersect-interval.2 = integer-to-real(y-coord(first(z-vertices)))
                    then undefined % elaborate later
                    elseif Intersect-interval.1 = integer-to-real(x-coord(second(z-vertices)))
                           & Intersect-interval.2 = integer-to-real(y-coord(second(z-vertices)))
                    then (if y-coord(first(z-vertices)) <= cy
                          then (if y-coord(second(z-vertices)) <= cy
                                then COUNT-INTERSECTIONS(cx, cy, outx, outy,
                                                         rest(rest(z-vertices)), cnt)
                                else COUNT-INTERSECTIONS(cx, cy, outx, outy,
                                                         rest(rest(z-vertices)), cnt + 1))
                          else (if y-coord(second(z-vertices)) <= cy
                                then COUNT-INTERSECTIONS(cx, cy, outx, outy,
                                                         rest(rest(z-vertices)), cnt + 1)
                                else COUNT-INTERSECTIONS(cx, cy, outx, outy,
                                                         rest(rest(z-vertices)), cnt)))
                    else % normal case
                      COUNT-INTERSECTIONS(cx, cy, outx, outy, rest(z-vertices), cnt + 1))
                 else % coincident lines
                   undefined)
           else % no intersection
             COUNT-INTERSECTIONS(cx, cy, outx, outy, rest(z-vertices), cnt)))

function IN-POLYGON
  (c : CONTACT, z : POLYGON) : boolean
  = (let (p : POSITION = coordinates(c),
          z-vertices : seq(POSITION) = vertices(z))
     let (cx = x-coord(p), cy = y-coord(p),
          outx = -1, outy = y-coord(p))
     if count-intersections(cx, cy, outx, outy,
                            z-vertices ++ [first(z-vertices)], 0) = 1 mod 2
     then true
     else false
    )

function MAKE-TEST-ZONE ()
  = (let (z : ZONE = make-object('TIGHT-ZONE),
          poly : POLYGON = make-object('POLYGON),
          p1 : POSITION = make-object('POSITION),
          p2 : POSITION = make-object('POSITION),
          p3 : POSITION = make-object('POSITION),
          p4 : POSITION = make-object('POSITION),
          p5 : POSITION = make-object('POSITION),
          p6 : POSITION = make-object('POSITION)
         )
     x-coord(p1) <- 0; y-coord(p1) <- 25;
     x-coord(p2) <- 118; y-coord(p2) <- 62;
     x-coord(p3) <- 259; y-coord(p3) <- 25;
     x-coord(p4) <- 259; y-coord(p4) <- 5;
     x-coord(p5) <- 118; y-coord(p5) <- 32;
     x-coord(p6) <- 0; y-coord(p6) <- 5;
     vertices(poly) <- [p1,p2,p3,p4,p5,p6];
     TZ-shape(z) <- poly;
     poly)

function MAKE-TEST-CONTACT ()
  = (let (c : CONTACT = make-object('CONTACT),
          p1 : POSITION = make-object('POSITION))
     x-coord(p1) <- 258; y-coord(p1) <- 183;
     coordinates(c) <- p1;
     c)

% - 

THEORY-LAWS 

assert def-of-CLASSIFIER
  fa(c : CLASSIFIER) c in {'friendly, 'hostile, 'own, 'unknown}


assert def-of-AIR-SEA-MODE
  fa(asm : AIR-SEA-MODE) asm in {'AIR, 'SEA}


% - 

THEORY-RULES 

% - 


THEORY-MISC-LAWS 
% -

THEORY-MISC-DEFS 

% - 

THEORY-MISC-RULES 

% - 

THEORY-MISC-FORMS 


end-theory 
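
The idea behind IN-POLYGON and COUNT-INTERSECTIONS in the listing above, casting a horizontal ray from the contact and checking whether it crosses the polygon boundary an odd number of times, can be sketched in Python as follows. This is a standard even-odd formulation, not a translation of the Refine functions. In particular, the half-open comparison used here to handle vertices that lie on the ray is an assumption, whereas the listing leaves some of those cases undefined. The test polygon and contact coordinates are taken from MAKE-TEST-ZONE and MAKE-TEST-CONTACT.

```python
def in_polygon(point, vertices):
    """Even-odd test: cast a horizontal ray from the point toward -x and
    count the polygon edges it crosses; an odd count means inside."""
    px, py = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        # Half-open rule: an edge counts only if its endpoints lie on
        # opposite sides of the ray, so a vertex on the ray is counted once
        # and horizontal edges are skipped (no division by zero below).
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross < px:  # the crossing lies on the ray toward -x
                inside = not inside
    return inside


# Vertices from MAKE-TEST-ZONE and the position from MAKE-TEST-CONTACT.
test_zone = [(0, 25), (118, 62), (259, 25), (259, 5), (118, 32), (0, 5)]
test_contact = (258, 183)

print(in_polygon(test_contact, test_zone))
```

The test contact lies above every vertex of the test polygon, so no edge straddles the ray and the contact is reported as outside the zone.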


Appendix  C 
QUESTIONNAIRE 


The  following  questions  address  the  software  development  model  in  the  tool.  These  questions  address 
several  functions  in  the  model.  Each  function  is  indicated  with  italics.  For  each  function,  please  give  a 
general  assessment,  as  well  as  a  specific  assessment  of  the  best  features  and  any  recommendations  you 
would  have  for  changes. 


Scale     0      1      2         3               4      5
Rating    Poor   Fair   Average   Above Average   Good   Excellent

Q1. In general, how well does the tool support the development of a theory of the domain?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q2. As a part of developing the domain theory and specification, how well does the tool support
specifying functional constraints on input/output behavior?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q3. As a part of developing the domain theory and specification, how well does the tool support the
generation of rules and derivation of laws?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q4. In general, how well does the tool support converting a specification to code?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q5. As part of converting a specification to code, how well does the tool support algorithm design?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q6. As part of converting a specification to code, how well does the tool support algorithm
simplification?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q7. As part of converting a specification to code, how well does the tool support partial evaluation?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

Q8. As part of converting a specification to code, how well does the tool support refinement of data
types?

What are the tool's best features for this function?

What changes or additions would you suggest for this function?

The next few questions address how the tool works in supporting software design.

Q9. How well does the tool support exploration of the design space (i.e., exploration of alternative
design solutions)?

Q10. How well does the tool support the generation of alternative designs?

Q11. How well does the tool support the evaluation of alternative designs?

Q12. What are the strengths of the tool?

Q13. What are the weaknesses of the tool?

Q14. In general, what additional functionality would you like to see?

Q15. What changes would you suggest to the interface?

Q16.  Is  the  symbology  appropriate  in  the  interface? 

Q17.  Are  the  graphics  adequate? 

Q18.  Is  the  response  time  of  the  tool  to  user  commands  adequate? 

Q19.  Is  the  software  model  implemented  in  this  tool  consistent  with  your  concept  of  the  software 
development  process? 

Q20.  Is  the  problem  that  the  tool  was  used  on  sufficiently  complex  to  assess  the  utility  of  the  tool  for 
designing  software  for  Navy  systems? 


The  following  features  have  been  found  important  in  the  usability  of  the  interface.  Please  comment  on  the 
extent  to  which  these  features  are  present  in  the  tool,  and  elaborate  on  your  assessment. 


Feature                            Present    Elaboration

Simple and natural dialogue
Speaks the user's language
Minimizes user memory load
Consistency
Provides feedback
Provides clearly marked exits
Provides shortcuts
Provides good error messages
Designed to prevent errors
Tolerates errors
Provides for error recovery
Minimizes system response time


